DOOML: add compiler doing everything for A (well, we hope to add everything) #50
Kakadu merged 58 commits into Kakadu:master
This is a quasi-initial commit of the DOOML language compiler, written as part of the 2025 SPbU course on functional language compilers. Basically, DOOML is an OCaml-like language with a (drastically) truncated list of features. This patch introduces a basic AST and a parser with pretty-printing facilities. Only integers, units and tuples are supported so far. This will be implemented by Georgiy Belyanin and Ignatiy Sergeev.
This patch introduces A-normal form (ANF) as a step in the compiler's middle-end. ANF ensures that function calls take only variables or immediate values as arguments. This makes the code resemble assembly, since assembler instructions usually work with registers and immediate values rather than with compound expressions. Example:
```ocaml
(* Before ANF *)
let f =
  let q = f ((g + sup0) * (2 * i)) in
  q
;;

(* After ANF *)
let f =
  let sup2 = (*) 2 i in
  let sup5 = (+) g sup0 in
  let sup6 = (*) sup5 sup2 in
  let sup7 = (f) sup6 in
  let q = sup7 in
  q
;;
```
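The transformation described above can be sketched on a tiny expression language. This is only an illustration: the `expr`, `imm` and `aexpr` types, the `t<n>` temporary names and the continuation-passing structure are assumptions made for the example, not the types or naming scheme used in DOOML.

```ocaml
(* Hypothetical mini-AST; the real DOOML AST is richer. *)
type expr =
  | Int of int
  | Var of string
  | App of string * expr list (* function or operator applied to arguments *)

(* In ANF, call arguments are immediates only. *)
type imm =
  | ImmInt of int
  | ImmVar of string

type aexpr =
  | Ret of imm
  | Let of string * string * imm list * aexpr (* let x = f args in body *)

(* [go e n k] converts [e]; [n] is the next free temporary index and [k]
   receives the immediate holding the value of [e] plus the updated index. *)
let rec go e n (k : imm -> int -> aexpr) : aexpr =
  match e with
  | Int i -> k (ImmInt i) n
  | Var x -> k (ImmVar x) n
  | App (f, args) ->
    go_list args n (fun imms n ->
      let name = Printf.sprintf "t%d" n in
      Let (name, f, imms, k (ImmVar name) (n + 1)))

(* Converts arguments left to right, collecting their immediates. *)
and go_list es n k =
  match es with
  | [] -> k [] n
  | e :: rest -> go e n (fun i n -> go_list rest n (fun is n -> k (i :: is) n))

let to_anf e = go e 0 (fun i _ -> Ret i)
```

Every nested call ends up bound to its own temporary, so each `Let` only mentions variables and constants.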
This patch fixes AST pretty-printing by properly using format boxes. It
is relatively hard to describe what has changed since the previous usage
was completely wrong. Let's provide a few examples instead to notice the
difference between the old formatting and the new one.
Old:
```
let a = 15 in let b = 4 in
let c = 8 in ...
```
New:
```
let a = 15 in
let b = 4 in
let c = 8 in
...
```
Old:
```
let smth = if long_cond then long_one else long_two in
```
New:
```
let smth = if long_cond then
long_one
else
long_two
in
```
The previous implementation of ANF conversion for if-then-else was wrong: it executed both the then and the else branch and then chose one of the results. This patch fixes it, so only the selected branch is executed.
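The difference between the two shapes can be sketched as follows. The `aexpr` type and the names `on_true` and `on_false` are hypothetical and only serve the illustration; the actual DOOML ANF type may look different.

```ocaml
type imm =
  | Int of int
  | Var of string

type aexpr =
  | Ret of imm
  | Let of string * string * imm list * aexpr (* let x = f args in body *)
  | If of imm * aexpr * aexpr (* each branch keeps its own bindings *)

(* Names bound before any branching, i.e. computed whatever the condition is. *)
let rec unconditional = function
  | Ret _ | If _ -> []
  | Let (x, _, _, body) -> x :: unconditional body

(* Buggy shape: both calls are hoisted above the [if]. *)
let hoisted =
  Let ("t", "on_true", [],
    Let ("e", "on_false", [], If (Var "c", Ret (Var "t"), Ret (Var "e"))))

(* Fixed shape: each call lives inside its own branch. *)
let branched =
  If (Var "c",
    Let ("t", "on_true", [], Ret (Var "t")),
    Let ("e", "on_false", [], Ret (Var "e")))
```

In the hoisted form both calls run before the condition is inspected, which is observable as soon as a branch has side effects or does not terminate.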
This patch prevents the parser from handling the keywords as identifiers.
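A minimal sketch of such a check, written against plain strings rather than the parser combinators used in the patch. The keyword list below is an assumption; the actual set of DOOML keywords may differ.

```ocaml
(* Assumed keyword list; the real DOOML set may differ. *)
let keywords = [ "let"; "rec"; "in"; "if"; "then"; "else"; "fun" ]

(* Accept a lexed word as an identifier only if it is not reserved. *)
let identifier (word : string) : (string, string) result =
  if List.mem word keywords
  then Error (Printf.sprintf "keyword %S cannot be used as an identifier" word)
  else Ok word
```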
This patch introduces plugs, tuples, and units into the parser. In other words, the following code can now be parsed properly.
```ocaml
let () = _;;
let _ = _;;
let (a, b) = _;;
```
This patch fixes the if-then-else (ite) condition parsing. The problem was that it was impossible to use ite inside ite conditions (for some unknown reason). The problem could be fixed by removing a few parser-combinator commits. Let's do it even though it would make errors less useful. Actually, the patch improves pretty-printing too.
This patch significantly improves the ANF stage. It does the following.
* It makes ANF accept the whole program as input.
* It improves pretty-printing (similarly to AST in the previous patch).
* It allows tuple patterns to be converted to ANF as follows.

Before ANF.
```
let (a, b) = c;;
```
After ANF.
```
let a = nth c 0;;
let b = nth c 1;;
```
It also makes the ANF stage use a state monad internally.
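Both ideas can be sketched in a few lines. The state monad below carries only an integer counter and the lowering works on strings; both are simplifications assumed for the example, and the names `fresh`, `run` and `lower_tuple` are hypothetical rather than taken from the DOOML sources.

```ocaml
(* A minimal state monad over an integer counter. *)
type 'a t = int -> 'a * int

let return (x : 'a) : 'a t = fun s -> x, s

let ( let* ) (m : 'a t) (f : 'a -> 'b t) : 'b t =
  fun s ->
    let x, s' = m s in
    f x s'

(* Produces a new temporary name and advances the counter. *)
let fresh : string t = fun s -> Printf.sprintf "sup%d" s, s + 1

let run (m : 'a t) : 'a = fst (m 0)

(* [let (p0, p1, ...) = src] becomes one projection per component;
   bindings are rendered as (name, right-hand side) pairs of text. *)
let lower_tuple (names : string list) (src : string) : (string * string) list =
  List.mapi (fun i name -> name, Printf.sprintf "nth %s %d" src i) names
```

The counter is threaded by `let*`, so no mutable reference is needed to generate distinct names.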
This patch introduces basic closure-conversion and lambda-lifting allowing DOOML to handle closures. They are usually used together so they are added in a single patch.
```ocaml
let f = fun a c ->
  let g = fun b -> a + b in
  g c
;;

(* CC turns it into... *)
let f = fun a c ->
  let g = (fun a b -> a + b) a in
  g c
;;

(* LL turns it into... *)
let g = (fun a b -> a + b) a;;
let f = fun a c -> g c
;;
```
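The closure-conversion half can be sketched on a hypothetical mini-AST. The `expr` type and the `free_vars` and `cc` functions are assumptions for illustration and cover only the constructs used in the example above; lambda-lifting is left out.

```ocaml
module S = Set.Make (String)

(* Hypothetical mini-AST, not the DOOML one. *)
type expr =
  | Var of string
  | Int of int
  | Add of expr * expr
  | Fun of string list * expr
  | App of expr * expr list
  | Let of string * expr * expr

let rec free_vars = function
  | Var x -> S.singleton x
  | Int _ -> S.empty
  | Add (a, b) -> S.union (free_vars a) (free_vars b)
  | Fun (params, body) -> S.diff (free_vars body) (S.of_list params)
  | App (f, args) ->
    List.fold_left (fun acc a -> S.union acc (free_vars a)) (free_vars f) args
  | Let (x, rhs, body) -> S.union (free_vars rhs) (S.remove x (free_vars body))

(* Every [fun] that captures variables receives them as extra leading
   parameters and is immediately applied to them. *)
let rec cc e =
  match e with
  | Var _ | Int _ -> e
  | Add (a, b) -> Add (cc a, cc b)
  | App (f, args) -> App (cc f, List.map cc args)
  | Let (x, rhs, body) -> Let (x, cc rhs, cc body)
  | Fun (params, body) ->
    let body = cc body in
    let captured = S.elements (free_vars (Fun (params, body))) in
    if captured = []
    then Fun (params, body)
    else App (Fun (captured @ params, body), List.map (fun v -> Var v) captured)
```

Applied to the body of `f` from the commit message, `cc` turns `fun b -> a + b` into `(fun a b -> a + b) a` and leaves the outer function unchanged, since it captures nothing.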
This patch introduces basic DOOML compilation into RISC-V assembly, namely rv64gc. The patch does not yet verify the generated code. However, a few tests are about to be added soon. Additionally, there is a C runtime. It must be compiled with a cross-toolchain and linked with the generated assembly in order for everything to work properly.
8d3edd1 to f23b6c7
Let's try to clean up everything...
f23b6c7 to b138e2c
7d81b41 to 719eb56
@@ -0,0 +1,431 @@
open Angstrom
File 'DOOML/lib/fe.ml' doesn't have a corresponding .mli interface
open DOOML
module Map = Base.Map.Poly

let failf fmt = Format.kasprintf failwith fmt
Using failwith (or assert false) is usually a clue that a corner case is not being handled properly. To report errors, we recommend using an error monad instead. In principle, these constructions are OK for temporary work-in-progress code, but they should be eliminated before release.
@@ -0,0 +1,427 @@
type immexpr =
OCaml files should provide license information in the second line (structure item)
cond_
in
return ret
| Fun _ -> failwith "should be CC/LL first"
Using failwith (or assert false) is usually a clue that a corner case is not being handled properly. To report errors, we recommend using an error monad instead. In principle, these constructions are OK for temporary work-in-progress code, but they should be eliminated before release.
| Etc of string
[@@deriving variants]

let todo () = failwith "todo"
Using failwith (or assert false) is usually a clue that a corner case is not being handled properly. To report errors, we recommend using an error monad instead. In principle, these constructions are OK for temporary work-in-progress code, but they should be eliminated before release.
open State

let spf = Format.asprintf
let failf fmt = Format.kasprintf failwith fmt
Using failwith (or assert false) is usually a clue that a corner case is not being handled properly. To report errors, we recommend using an error monad instead. In principle, these constructions are OK for temporary work-in-progress code, but they should be eliminated before release.
| ">=" -> [ slt rd rs1 rs2; xori rd rd 1 ]
| "=" -> [ sub (temp 0) rs1 rs2; seqz rd (temp 0) ]
| "<>" -> [ sub (temp 0) rs1 rs2; snez rd (temp 0) ]
| _ -> assert false
Using failwith (or assert false) is usually a clue that a corner case is not being handled properly. To report errors, we recommend using an error monad instead. In principle, these constructions are OK for temporary work-in-progress code, but they should be eliminated before release.
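One way to follow this recommendation is sketched below with the standard `Result` type. The `error` type, the function names and the rendering of instructions as strings are assumptions for illustration (with `t0` standing in for `temp 0`); they are not the definitions used in `riscv.ml`.

```ocaml
(* Hypothetical error type; a real one would list every failure mode. *)
type error = Unknown_operator of string

let ( let* ) = Result.bind

(* Mirrors the comparison cases quoted above, rendered as assembly text.
   The catch-all case returns an error value instead of [assert false]. *)
let compare_insns op rd rs1 rs2 =
  match op with
  | ">=" ->
    Ok [ Printf.sprintf "slt %s, %s, %s" rd rs1 rs2
       ; Printf.sprintf "xori %s, %s, 1" rd rd ]
  | "=" ->
    Ok [ Printf.sprintf "sub t0, %s, %s" rs1 rs2; Printf.sprintf "seqz %s, t0" rd ]
  | "<>" ->
    Ok [ Printf.sprintf "sub t0, %s, %s" rs1 rs2; Printf.sprintf "snez %s, t0" rd ]
  | other -> Error (Unknown_operator other)

(* Callers chain fallible steps with [let*]; the first error short-circuits. *)
let emit op =
  let* insns = compare_insns op "a0" "a1" "a2" in
  Ok (String.concat "\n" insns)
```

With this shape the unknown-operator case becomes part of the function's type, so callers have to decide how to report it.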
DOOML/lib/riscv.ml
Outdated
in
let argc = List.length args in
(match arity with
| `Fun arity when arity == argc ->
@@ -0,0 +1,19 @@
module M (S : sig
OCaml files should provide license information in the second line (structure item)
719eb56 to 37222fb
Documentation and test coverage (75.88%) should appear soon. https://kakadu.github.io/comp25/docs/DOOML https://kakadu.github.io/comp25/cov/DOOML 2026-03-16 18:23
@@ -0,0 +1,120 @@
[@@@ocaml.text "/*"]
File 'DOOML/lib/llvm_wrapper.ml' doesn't have a corresponding .mli interface
open (val Llvm_wrapper.make context builder the_module)

let failf fmt = Format.kasprintf failwith fmt
Using failwith (or assert false) is usually a clue that a corner case is not being handled properly. To report errors, we recommend using an error monad instead. In principle, these constructions are OK for temporary work-in-progress code, but they should be eliminated before release.
ctx'.lifts, ctx', ast
in
ctx := reset ctx';
ctx := extend pattern !ctx |> snd;
Using mutable data structures for teaching purposes is usually discouraged. Replace Hashtables with standard tree-like maps or consider Hash-Array Mapped Tries (HAMT). Use mutable references and mutable structure fields only if it is really required. Wherever they are indeed needed, describe in a comment why they are needed there.
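As an illustration of the suggested style, the sketch below threads an immutable map through a fold instead of updating a `ref`. The context contents (names mapped to arities) and the `extend` and `process_program` functions are hypothetical and much simpler than the context in the reviewed code.

```ocaml
module Ctx = Map.Make (String)

(* Hypothetical context: maps a top-level name to its arity. *)
let extend name arity ctx = Ctx.add name arity ctx

(* The fold passes the updated context from one definition to the next,
   so no [ref] is needed. Each output pairs a name with the number of
   definitions visible once it has been added. *)
let process_program (defs : (string * int) list) =
  let ctx, out =
    List.fold_left
      (fun (ctx, out) (name, arity) ->
        let ctx = extend name arity ctx in
        ctx, (name, Ctx.cardinal ctx) :: out)
      (Ctx.empty, [])
      defs
  in
  ctx, List.rev out
```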
Linter report from 2026-03-16 18:24, for mini language DOOML
Let's pretend that everything here is good enough for the TP students.
This PR introduces a complete DOOML compiler implementation. It was made by Georgiy Belyanin (@georgiy-belyanin) and Ignat Sergeev (@IgnatSergeev).
What features were implemented?
* CC/LL.
* ANF.